XD1 SIG report
 
Chair Liam Forbes
CUG 2007
 
The CUG 2007 XD1 Interactive Session was reasonably well attended.  There
were 5 sites represented (ARSC, NRL, SMDC, ORNL, and Rice) and 4 Cray Inc. liaisons.
The discussion agenda was:
 
1. Introductions and Reminders
    SIG Chairs: Liam Forbes (Systems) and Scott Kneller (DOD)
    Cray Liaisons: Cindy Nuss (Applications), Luiz DeRose (Programming Environments), Nola Van Vugt (User Services), Jim Harrell (Systems & Integration), Charlie Carroll (Operating Systems), and Charlie Clark (Operations)
2. Identify XD1 Related Presentations and BoFs
    Monday: 4pm (Emerald III), XD1 Interactive Session by CUG & Cray
    Tuesday: 11am (San Juan/Whidbey), XD1 Extreme
        Application Performance Profiling on the Cray XD1 Using HPC Toolkit by Jan Odegard
        The Naval Research Laboratory Cray XD1 by Marco Lanzagorta
3. Cray Inc. Comments Re XD1
4. Open Discussion
5. Solicit Comments on Re-Structuring the SIGs
 
The participants spent 40 minutes discussing the state of the XD1 and current support activities.  Everybody understood that the XD1 is no longer a current product, but several of the participating members thought their XD1 would probably remain in use for another 2 or 3 years.  It is generally a stable system that produces useful cycles.
 
Luiz DeRose (Cray Inc.) reported that an update for Apprentice and CrayPat will be released in the next month or so.  This will probably be the last release that includes new features for the XD1.
 
Jan Newton (Rice) asked whether other sites were seeing issues with the L2Forwarding feature that require them to reboot the entire system to move nodes in and out of partitions.  Nobody else had seen that particular problem, but Liam Forbes (ARSC) mentioned that their system is suffering from base node reboots during times of high network utilization, including L2Forwarding traffic.  Don (Cray) took notes and will bring the questions and comments back to Cray.  The individual sites have cases open on these issues.
 
Andrey K. (??) emailed us a question asking whether an MPICH2 port had been started for the XD1 and then discontinued by Cray.  If so, could Cray either finish the port or release it for an interested site or user to complete?  Luiz DeRose said he would look into what might have been done and, if anything was, what Cray's policy would be on releasing it to the community for completion.
 
Jeanie Osburn (NRL) brought up a problem they are having with incomplete or incomprehensible error messages being generated by the mpiexec command.  There is a case open and a possible update in the future.
 
Everyone was and is encouraged to send comments and queries to the XD1 mailing list.  Questions asking whether other sites are seeing similar problems would be especially helpful to members and to Cray.  Nobody is really sure why the list isn't being used, but the participants would like it to be.
 
For the last 15 minutes we discussed how CUG might re-organize the SIG structure for next year.  Most folks agreed that having the SIGs mirror the product lines is not working to increase attendance and participation.  There is probably some functional grouping, or a combination of functions and architectures, that would work better.  However, everyone was in favor of having at least an XD1 BoF session at the 2008 CUG.